Findings:
Based on the absolute difference in population between 2010 and 2020 in Santa Clara, there seem to be many small pockets of large population change, while most of the Santa Clara area experienced very little change. However, the population difference alone does not give a full picture of what this means: we don't know the breakdown of migration, births, and deaths, so we can't draw a clear conclusion about how much the actual communities changed. Another key finding is my learning curve with R. I had never used R before, and I found a significant gap between the chapters and in-class R sessions and what the assignment required. Because of that, I spent a lot of time learning how to combine and manipulate data tables. This was an insightful R assignment, but further in-class or online chapter support would be appreciated for future assignments and future iterations of this course.
Key Assumptions:
In the Census data we used, the 2010 data set had more data points than the 2020 data set. There may be a variety of reasons for this, ranging from COVID-related reasons to changes in how the data was collected to changes in neighborhood boundaries. To present a visual of the difference between 2010 and 2020, I replaced missing data and "NA" values with zero. This does not accurately reflect actual population change: an area missing from one year appears to have gained or lost its entire population. It does, however, give a sense of what the data shows and of where the gaps lie between the data collected and the reality of the situation. Another assumption is the validity, completeness, and accuracy of the data set used. The data set is gathered and produced by the US Census, so we are assuming it comes from a credible source on the topic and was gathered in a fair and unbiased way.
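The join-and-zero-fill step described above can be sketched roughly as follows. This is an illustrative Python sketch under stated assumptions, not the R code used for the assignment: the area IDs, the population counts, and the names `population_change` and `imputed` are all hypothetical. It matches areas across the two years, treats a missing value as zero, and computes the absolute difference.

```python
# Hypothetical populations keyed by a made-up area ID; not real Census figures.
pop_2010 = {"A": 120, "B": 45, "C": 300}
pop_2020 = {"A": 150, "C": 290, "D": 60}


def population_change(old, new):
    """Match two {area_id: population} tables and compute the change.

    Areas present in only one year are kept (like a full/outer join),
    and the missing year's population is assumed to be 0.
    """
    rows = []
    for area in sorted(set(old) | set(new)):
        in_both = area in old and area in new
        p_old = old.get(area, 0)  # assumption: missing 2010 value -> 0
        p_new = new.get(area, 0)  # assumption: missing 2020 value -> 0
        rows.append({
            "area": area,
            "pop_2010": p_old,
            "pop_2020": p_new,
            "abs_change": abs(p_new - p_old),
            "imputed": not in_both,  # True when a zero was filled in
        })
    return rows


changes = population_change(pop_2010, pop_2020)
for row in changes:
    print(row)
```

Keeping a flag such as `imputed` makes it possible to tell apart areas whose large change comes from the zero fill and areas where both years were actually reported, which matters when reading the pockets of large change on the map.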